AITopics

#artificialintelligence Jun-1-2022, 18:20:51 GMT

Dimensionality Reduction: Principal Component Analysis

A dataset is made up of a number of features. As long as these features are related in some way to the target and are reasonable in number, a machine learning model will be able to produce decent results after learning from the data. But if the number of features is high and most of them do not contribute to the model's learning, the performance of the model goes down and the time taken to output predictions increases. Transforming the original feature space into a lower-dimensional subspace is one way of performing dimensionality reduction, and Principal Component Analysis (PCA) does exactly this. So let's take a look at the building blocks of PCA.
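As an illustration of the subspace transformation described above, here is a minimal NumPy sketch of PCA through the eigenvalues and eigenvectors of the covariance matrix. The function name `pca` and the synthetic data are assumptions made for this example; they do not come from the article.

```python
import numpy as np

def pca(X, n_components):
    """Project X (n_samples, n_features) onto its top principal components."""
    # Center each feature so the covariance matrix describes variation around the mean.
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]               # (n_features, n_components)
    explained = eigvals[order] / eigvals.sum()   # share of variance per component
    return X_centered @ components, components, explained

# Hypothetical data: 5 observed features driven by 2 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

Z, components, explained = pca(X, n_components=2)
print(Z.shape, explained.sum())
```

Because the synthetic features are generated from two latent factors, two components should capture almost all of the variance; on real data the number of components is a choice that has to be checked against the explained variance.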

eigen value, eigen vector, vector, (9 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.63)

#artificialintelligence Sep-26-2021, 00:10:15 GMT

Computer Vision and Deep Learning - Part 4

FAST does not perform well when multiple features have to be detected in the same region of an image. Non-Maximum Suppression is used to handle this: a score function, V, is computed for all the detected feature points. In a nutshell, FAST is faster than many existing feature detectors but performs poorly in the presence of a high level of noise, mainly because the noise alters the pixel values. The OpenCV documentation mentions two feature matching methods.
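The suppression step can be sketched as follows. This is a simplified illustration, not OpenCV's implementation: it assumes V is the sum of absolute differences between a candidate pixel and the 16 pixels on the radius-3 circle around it, and the helper names `corner_score` and `non_max_suppression`, the `radius` parameter, and the toy image are all hypothetical.

```python
import numpy as np

# Offsets (dx, dy) of the 16-pixel circle of radius 3 examined by FAST.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def corner_score(image, x, y):
    """V: sum of absolute differences between pixel (x, y) and its circle pixels."""
    center = int(image[y, x])
    return sum(abs(int(image[y + dy, x + dx]) - center) for dx, dy in CIRCLE)

def non_max_suppression(keypoints, scores, radius=1):
    """Greedy NMS: visit keypoints by descending score and drop any keypoint
    that lies within `radius` pixels (Chebyshev distance) of one already kept."""
    kept = []
    for i in np.argsort(scores)[::-1]:
        xi, yi = keypoints[i]
        if all(max(abs(xi - xk), abs(yi - yk)) > radius for xk, yk in kept):
            kept.append(keypoints[i])
    return kept

# Toy image: two adjacent candidate points and one isolated candidate.
image = np.zeros((12, 12), dtype=np.uint8)
image[3, 3], image[3, 4], image[7, 7] = 200, 100, 150
keypoints = [(3, 3), (4, 3), (7, 7)]          # (x, y) pairs
scores = [corner_score(image, x, y) for x, y in keypoints]
kept = non_max_suppression(keypoints, scores)
print(scores, kept)
```

Of the two adjacent candidates, only the one with the higher V survives, while the isolated candidate is unaffected.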

deep learning -part 4, intensity, variation, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.42)

#artificialintelligence Aug-24-2021, 05:40:08 GMT

PCA on HyperSpectral Data

Hyperspectral data expands the capability of image classification. It not only distinguishes different land cover types but also provides detailed characteristics of each land cover, such as minerals, soil, man-made structures (buildings, roads, etc.) and vegetation types. One disadvantage of hyperspectral data is that there are too many bands to process. It is also a challenge to store such a large amount of data, and with that much data the computation time increases as well.
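One common way to cut down the number of bands is to treat every pixel as a sample and every band as a feature, then apply PCA across the bands. The sketch below assumes a cube laid out as (rows, cols, bands) and uses synthetic data; the function name `reduce_bands` and the cube itself are illustrative assumptions, not taken from the article.

```python
import numpy as np

def reduce_bands(cube, n_components):
    """Compress a (rows, cols, bands) cube to (rows, cols, n_components) with PCA."""
    rows, cols, bands = cube.shape
    # Flatten the spatial grid: one row per pixel, one column per band.
    pixels = cube.reshape(-1, bands).astype(float)
    pixels -= pixels.mean(axis=0)
    # The right singular vectors of the centered pixel matrix are the principal axes.
    _, singular_values, vt = np.linalg.svd(pixels, full_matrices=False)
    scores = pixels @ vt[:n_components].T
    explained = singular_values[:n_components] ** 2 / np.sum(singular_values ** 2)
    return scores.reshape(rows, cols, n_components), explained

# Hypothetical cube: 3 materials with random 50-band spectra, mixed per pixel.
rng = np.random.default_rng(0)
abundances = rng.uniform(size=(20, 30, 3))
spectra = rng.uniform(size=(3, 50))
cube = abundances @ spectra + 0.01 * rng.normal(size=(20, 30, 50))

reduced, explained = reduce_bands(cube, n_components=3)
print(reduced.shape, explained.sum())
```

The reduced cube keeps the spatial layout, so it can be passed to a classifier or stored in place of the original bands.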

hyperspectral data, information, pca, (5 more...)

Technology: Information Technology > Artificial Intelligence > Vision (0.38)

arXiv.org Machine Learning Sep-9-2020

Consistency and Regression with Laplacian regularization in Reproducing Kernel Hilbert Space

Cabannes, Vivien

This note explains a way to look at reproducing kernel Hilbert spaces (RKHS) for regression problems. It consists in expressing kernel regression solutions with simple integral operator algebra, which we can approximate consistently from empirical data, providing the corresponding estimators of the solutions. Let's consider the classical regression problem $\arg\min \|f(x) - y\|$. In practice we are going to restrict the search for a solution $f \in F$ to a simpler function space, $f \in H$. Let's associate to it the canonical RKHS $H$ (see Aronszajn, 1950). It is good to find a function $f$ from $X$ to $\mathbb{R}$, but what if $Y$ is a real Hilbert space? Indeed, it is natural to extend the theory of RKHS to vector-valued functions (Schwartz, 1964). Once again we can build a Hilbert space of functions from $X$ to $Y$; let's first define $\gamma$ ... Those are going to be the building elements of $H$. Definition 1 (The RKHS $H$).

artificial intelligence, machine learning, operator, (13 more...)

2009.04324

Country: Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

#artificialintelligence Aug-31-2020, 20:16:16 GMT

Risks and Caution on applying PCA for Supervised Learning Problems

The curse of dimensionality is a crucial problem when dealing with real-life datasets, which are generally high-dimensional. As the dimensionality of the feature space increases, the number of possible configurations can grow exponentially, and thus the fraction of configurations covered by the observations decreases. In such a scenario, Principal Component Analysis plays a major part in efficiently reducing the dimensionality of the data while retaining as much as possible of the variation present in the data set. Let us give a very brief introduction to Principal Component Analysis before delving into the actual problem. The central idea of Principal Component Analysis (PCA) is to reduce the dimensionality of a data set consisting of a large number of correlated variables, while retaining the maximum possible variation present in the data set.

artificial intelligence, machine learning, principal component, (14 more...)

Country:

North America > United States > New York (0.05)
North America > United States > Massachusetts > Middlesex County > Reading (0.05)

Genre: Research Report (0.31)

Industry: Education > Focused Education > Special Education (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.69)

Joukov, Vladimir, Kulić, Dana

Fast Approximate Multi-output Gaussian Processes

arXiv.org Machine Learning Aug-22-2020

Gaussian process regression models are an appealing machine learning method as they learn expressive non-linear models from exemplar data with minimal parameter tuning and estimate both the mean and covariance of unseen points. However, exponential computational complexity growth with the number of training samples has been a long-standing challenge. During training, one has to compute and invert an $N \times N$ kernel matrix at every iteration. Regression requires computation of an $m \times N$ kernel where $N$ and $m$ are the number of training and test points respectively. In this work we show how approximating the covariance kernel using eigenvalues and eigenfunctions leads to an approximate Gaussian process with significant reduction in training and regression complexity. Training with the proposed approach requires computing only an $N \times n$ eigenfunction matrix and an $n \times n$ inverse where $n$ is a selected number of eigenvalues. Furthermore, regression now only requires an $m \times n$ matrix. Finally, in a special case the hyperparameter optimization is completely independent from the number of training samples. The proposed method can regress over multiple outputs, estimate the derivative of the regressor of any order, and learn the correlations between them. The computational complexity reduction, regression capabilities, and multioutput correlation learning are demonstrated in simulation examples.
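To make the matrix sizes in the abstract concrete, the sketch below shows one known way to build such an approximation: a reduced-rank Gaussian process on the interval $[-L, L]$ that uses the eigenfunctions of the Laplacian and the spectral density of a squared-exponential kernel. This is an assumed, single-output illustration and not necessarily the authors' exact formulation; the function names, the hyperparameter values, and the synthetic data are all hypothetical.

```python
import numpy as np

def eigenfunctions(x, n, L):
    """Laplacian eigenfunctions on [-L, L]; returns a (len(x), n) matrix."""
    j = np.arange(1, n + 1)
    return np.sin(np.pi * j * (x[:, None] + L) / (2 * L)) / np.sqrt(L)

def se_spectral_density(omega, lengthscale, variance):
    """Spectral density of the 1-D squared-exponential kernel."""
    return (variance * np.sqrt(2 * np.pi) * lengthscale
            * np.exp(-0.5 * (lengthscale * omega) ** 2))

def fit_predict(x_train, y_train, x_test, n=20, L=4.0,
                lengthscale=0.7, variance=1.0, noise_var=0.01):
    """Posterior mean of the reduced-rank GP at x_test."""
    sqrt_eigvals = np.pi * np.arange(1, n + 1) / (2 * L)
    prior_var = se_spectral_density(sqrt_eigvals, lengthscale, variance)
    Phi = eigenfunctions(x_train, n, L)        # N x n eigenfunction matrix
    Phi_test = eigenfunctions(x_test, n, L)    # m x n matrix used for regression
    # Only this n x n system is solved; no N x N kernel matrix is ever formed.
    A = Phi.T @ Phi + noise_var * np.diag(1.0 / prior_var)
    return Phi_test @ np.linalg.solve(A, Phi.T @ y_train)

# Hypothetical data: noisy samples of sin(x) on [-3, 3].
rng = np.random.default_rng(0)
x_train = rng.uniform(-3, 3, size=200)
y_train = np.sin(x_train) + 0.1 * rng.normal(size=200)
x_test = np.linspace(-2.5, 2.5, 50)

mean = fit_predict(x_train, y_train, x_test)
print(np.max(np.abs(mean - np.sin(x_test))))
```

The interval half-width `L` is chosen somewhat larger than the range of the inputs, since this kind of approximation degrades near the boundary of the domain.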

artificial intelligence, kernel, machine learning, (18 more...)

2008.09848

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Patel, Devshree, Raval, Param, Parikh, Ratnam, Shastri, Yesha

Comparative Study of Machine Learning Models and BERT on SQuAD

arXiv.org Machine Learning May-22-2020

This study aims to provide a comparative analysis of the performance of certain popular machine learning models and the BERT model on the Stanford Question Answering Dataset (SQuAD). The analysis shows that the BERT model, which was once state-of-the-art on SQuAD, gives higher accuracy than the other models. However, BERT requires a greater execution time even when only 100 samples are used. This shows that higher accuracy comes at the cost of more training time. For the preliminary machine learning models, by contrast, execution time on the full data is lower but accuracy is compromised.

artificial intelligence, machine learning, representation, (13 more...)

2005.11313

Genre: Research Report (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Sahoo, Saswata, Chakraborty, Souradip

Graph Spectral Feature Learning for Mixed Data of Categorical and Numerical Type

arXiv.org Machine Learning May-6-2020

Feature learning in the presence of a mixed type of variables, numerical and categorical types, is an important issue for related modeling problems. For simple neighborhood queries under mixed data space, standard practice is to consider numerical and categorical variables separately and combine them based on some suitable distance functions. Alternatives, such as kernel learning or Principal Component Analysis, do not explicitly consider the inter-dependence structure among the mixed type of variables. In this work, we propose a novel strategy to explicitly model the probabilistic dependence structure among the mixed type of variables by an undirected graph. Spectral decomposition of the graph Laplacian provides the desired feature transformation. The eigenspectrum of the transformed feature space shows increased separability and more prominent clusterability among the observations. The main novelty of our paper lies in capturing interactions of the mixed feature type in an unsupervised framework using a graphical model. We numerically validate the implications of the feature learning strategy
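The spectral step can be illustrated on a generic weighted undirected graph. The sketch below does not reproduce the paper's construction of the dependence graph from mixed data; the weight matrix `W`, the function name `laplacian_embedding`, and the toy graph are assumptions made for this example.

```python
import numpy as np

def laplacian_embedding(W, n_components):
    """Spectral features from the unnormalized Laplacian of a symmetric weight matrix W."""
    degree = W.sum(axis=1)
    laplacian = np.diag(degree) - W
    # eigh returns eigenvalues in ascending order. For a connected graph the first
    # eigenvector is constant (eigenvalue 0) and carries no information, so skip it.
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    return eigvals[1:n_components + 1], eigvecs[:, 1:n_components + 1]

# Toy graph: two tightly connected triangles joined by one weak edge.
W = np.zeros((6, 6))
for group in ([0, 1, 2], [3, 4, 5]):
    for a in group:
        for b in group:
            if a != b:
                W[a, b] = 1.0
W[2, 3] = W[3, 2] = 0.05

eigvals, features = laplacian_embedding(W, n_components=1)
print(eigvals, features[:, 0])
```

In this toy graph the sign of the single spectral feature separates the two triangles, which is the kind of increased separability a spectral feature transformation aims for.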

artificial intelligence, dataset, machine learning, (16 more...)

2005.02817

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York (0.04)
Asia > Singapore (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)

P., Satya Jayadev, Narasimhan, Shankar, Bhatt, Nirav

Learning Conserved Networks from Flows

arXiv.org Machine Learning May-21-2019

The network reconstruction problem is one of the challenging problems in network science. This work deals with reconstructing networks in which the flows are conserved around the nodes. These networks are referred to as conserved networks. We propose a novel concept, the conservation graph, for describing conserved networks. The properties of the conservation graph are investigated. We develop a methodology to reconstruct conserved networks from flows by combining these graph properties with learning techniques, with polynomial time complexity. We show that exact network reconstruction is possible for radial networks. Further, we extend the methodology to reconstruct networks from noisy data. We demonstrate the proposed methods on different types of radial networks.

artificial intelligence, machine learning, matrix, (18 more...)

1905.08716

Country:

Asia > India > Tamil Nadu > Chennai (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.85)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)